Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More operators for Databricks Repos #22422

Merged
merged 5 commits into from
Mar 27, 2022

Conversation

alexott
Copy link
Contributor

@alexott alexott commented Mar 22, 2022

This PR adds two additional operators for Databricks Repos:

  • DatabricksReposCreateOperator - to create a new Databricks Repo
  • DatabricksReposDeleteOperator - to delete a Databricks Repo

@mik-laj
Copy link
Member

mik-laj commented Mar 22, 2022

Can this functionality be done without adding a new operator? I am afraid of adding more operators when we do not even have plans on how we want to maintain them and test if they work properly. Commiets check that the change looks good. Sometimes they will look at the documentation and see if it matches what is implemeents, but it may not work properly. What can we do to improve the operational readiness of this provider, since we have recently had more changes in it?

For Snowflake, we decided that we only have one operator, so it's easy to check that it's working properly.
For Google/AWS, we have prepared system tests and we want to run them on CI.

What would we like to do for Databricks? Do you have any plans?

@alexott
Copy link
Contributor Author

alexott commented Mar 22, 2022

@mik-laj These operators are for specific functionality that is available via specific REST API, so alternative will be to get down to low-level API calls. It's a different from the Snowflake connector where you do everything via SQL - for Databricks we have similar DatabricksSqlOperator...

We're planning to build a system test suite that will be used to test all operators.

@alexott alexott force-pushed the databricks-other-repos-operators branch from a5d8d41 to 1fca768 Compare March 22, 2022 11:45
@potiuk
Copy link
Member

potiuk commented Mar 22, 2022

We're planning to build a system test suite that will be used to test all operators.

Yep. We are just about to merge first change from AIP-47 #22311 and we will literally prepare (automatically) issues to convert all the providers. I will ask everyone invlved to comment and will assign them to those issues.

Comment on lines 94 to 97
if repo_path is not None and not self.__repos_path_regexp__.match(repo_path):
raise AirflowException(
f"repo_path should have form of /Repos/{{folder}}/{{repo-name}}, got '{repo_path}'"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since repo_path is a templated field, there is a chance that the arg passed in is a Jinja expression or an XComArg in which case the regexp check would fail with a false-negative result potentially because templated values aren't evaluated until the task begins to execute.

Can you move this to the execute() method scope?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point @josh-fell !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you - I'll fix it later today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@potiuk @josh-fell I've addressed review comments...

@alexott alexott force-pushed the databricks-other-repos-operators branch from c953678 to 5e07b20 Compare March 24, 2022 17:37
@alexott alexott force-pushed the databricks-other-repos-operators branch from e2879c9 to 109e3e3 Compare March 26, 2022 12:25
@potiuk potiuk merged commit 352d7f7 into apache:main Mar 27, 2022
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) kind:documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants